The WEITEK XL-Series processors are high-speed 32-bit CMOS numeric
RISC processors with sustained integer performance of up to 7 MIPS
and sustained floating point performance of up to 5 MFLOPS.



These processors are supported by a complete software development
environment, including C and FORTRAN compilers, an assembler,
program and functional simulators, a debugger, and a board-level
development system.
XL-SERIES

OVERVIEW 


Features 

HIGH-SPEED CMOS PROCESSORS 

XL-8000: 7 MIPS integer processor 

XL-8032: 7 MIPS, 5 MFLOPS single-precision 
floating point processor 

XL-8064: 7 MIPS, 5 MFLOPS double-precision 
floating point processor 

RISC ARCHITECTURE 
Single-cycle execution 

Three-address, register-to-register instructions 
32-word register files 

Separate code and data memories for high 
memory bandwidth 

Vectored interrupts 
DEVELOPMENT TOOLS

Industry-standard C and FORTRAN 77 compilers 
Assembler, linker, and debugger 
Functional and architectural simulators 
Development system 

RICH INSTRUCTION SET 

Full set of arithmetic and logical functions 

Single-cycle bitwise merge, field extract, field 
insert, and field merge 

Pre- and post-incrementing indexed addressing 
Sophisticated program control instructions 


Description 


The XL Series is a family of three VLSI RISC processors: the
XL-8000, a high-speed 32-bit processor; the XL-8032, a
single-precision floating point processor with all the features of
the XL-8000; and the XL-8064, a double-precision floating point
processor with all the features of the XL-8000 plus a full
implementation of 32- and 64-bit IEEE arithmetic.

All XL-Series processors have a 32-element register file, a
33-element program control stack, an on-chip integer multiply/divide
unit, and a complete set of arithmetic, bit manipulation, address
generation, and control instructions. The XL-8032 and XL-8064 also
have floating point units with their own on-chip register files.

These processors give high performance and are available with a full
complement of development tools, including C and FORTRAN 77
compilers, assembler, development system, and hardware and software
simulators.



Figure 1. Simplified block diagram of an XL-series processor 



The XL-Series Processor Family 


XL-8000 PROCESSOR 


The XL-8000 is a general-purpose 32-bit integer RISC processor with
enhancements to support high-speed bit manipulation, address
generation, and arithmetic. It achieves a sustained processing rate
of 7 MIPS (millions of instructions per second), with a peak of
10 MIPS.

The XL-8000 is useful in applications that require 
high-speed integer processing, such as 2-D graphics, 
logic simulation, communications, and control. 

XL-8032 PROCESSOR 

The XL-8032 is a 32-bit RISC floating point processor that achieves
a sustained processing rate of 7 MIPS and 5 MFLOPS (millions of
floating point operations per second) in the 32-bit IEEE format,
with a peak floating point rate of 20 MFLOPS. It has the same
integer instruction set as the XL-8000.

The XL-8032 is an ideal processor for applications that need fast
single-precision floating point, such as graphics transformation or
digital signal processing.

XL-8064 PROCESSOR

The XL-8064 is a 64-bit RISC floating point processor 
that achieves a sustained processing rate of 7 MIPS 
and 5 MFLOPS in either single- or double-precision 
IEEE floating point formats, with a peak floating point 
rate of 20 MFLOPS. The floating point unit is a full 
implementation of the IEEE floating point standard. 
The XL-8064 has the same integer instruction set as 
the XL-8000. 

There are two versions of the XL-8064: the XL-8164, 
which has a 32-bit data bus, and the XL-8364, which 
has a 64-bit data bus. 

The XL-8064 is ideal for applications that need double-precision
floating point, such as solids modeling, finite analysis, circuit
simulation, and general-purpose scientific computing.


Feature                     XL-8000       XL-8032       XL-8064

Floating point capability   software      32-bit        32- or 64-bit,
                                          IEEE format   full IEEE
                                                        implementation
Code bus                    32 bits       64 bits       64 bits
Data bus                    32 bits       32 bits       32 or 64 bits
Speeds                      100, 120 ns   100, 120 ns   100, 120 ns
Peak MIPS                   10            10            10
Sustained MIPS*             7             7             7
Peak MFLOPS                 -             20            20
Sustained MFLOPS            -             5             5
Number of VLSI components   2             3             3
Maximum code bandwidth      40 MB/sec     80 MB/sec     80 MB/sec
Maximum data bandwidth      40 MB/sec     40 MB/sec     80 MB/sec

* Sustained MIPS give performance relative to a VAX 11/780, which has a sustained performance of 1.0 MIPS. 

All performance numbers are for the fastest speed grade. 



Figure 2. Comparison of the XL-Series processors 
Related Documents

XL-SERIES COMPONENT DATA SHEETS 

Data sheets for the XL-8136 program sequencing unit, 
the XL-8137 integer processing unit, and the XL-3132 
and XL-3164 floating point units. 

XL-SERIES PROGRAMMER’S BINDER 

This binder contains software and programming information, including
descriptions of software tools, programming techniques, compilers,
and the XL-Series instruction set.

XL-SERIES SYSTEM DESIGNER’S BINDER 

This binder contains information about XL-Series hardware design,
systems software, functional simulators, and porting the XL software
to a target system.


XL-Series Software 

The XL-Series software development environment is available for
SUN-3 and VAX systems under 4.2 BSD UNIX, and the Compaq 386 under
the XENIX/386 operating system.

COMPILERS 

The XL-Series compilers are advanced optimizing compilers for C and
FORTRAN 77. Each is compatible with an industry-standard version of
the language: the C compiler is compatible with the UNIX™ portable C
compiler, and the FORTRAN compiler conforms to the ANSI FORTRAN 77
standard.

These compilers use a variety of techniques to increase the speed
and reduce the size of the program, including automatic allocation
of register variables, loop rotation, strength reduction, register
coalescing, and static address elimination.

PARALLELIZER 

The XL-Series parallelizer takes the output of the compiler and
performs a series of optimizations that take advantage of the
XL-Series architecture. It places instructions in parallel when
possible, makes use of shadow instructions, and takes advantage of
the floating point processor's pipelines. The output of the
parallelizer is XL-Series assembly code.

The parallelizer recognizes the capabilities of each of the
XL-Series processors. For example, it converts double-precision
floating point instructions to single-precision for the XL-8032,
which has no double-precision floating point. For the XL-8000, which
has no floating point processor, the parallelizer replaces floating
point instructions with calls to routines in a software floating
point library.

ASSEMBLER 

The XL-Series assembler converts assembly-language instructions into
an object module. The assembly language allows exact specification
of what happens in each cycle. Speed-critical routines or entire
applications can be written in assembly language for maximum
performance. Routines written in assembly language and high-level
languages can be mixed freely within an application.

LINKER AND LIBRARIAN 

The XL-Series linker joins multiple object files into a single
executable file. It allows modules compiled at different times to be
joined together, and can also link assembly-language modules with
compiled modules. The starting addresses of the code and data
segments can be specified, and the linker can produce ROMable code.

The librarian allows a set of modules to be combined 
into a single file, from which the desired routines can 
be extracted by the linker. 

SOFTWARE SIMULATOR 

The XL-Series software simulator is a program that allows
applications to be tested in the absence of a working XL system.
Programs can be loaded, executed, and debugged on the simulator.


FUNCTIONAL SIMULATORS 

The functional simulators model the behavior of the XL-Series
devices, giving the logic levels on each pin at four points during
every clock cycle (before and after the rising and falling edges of
the clock). Each simulator models the performance of one XL
component, and can be integrated into architectural or timing
simulators. The simulators are routines written in C, and are used
with the designer's simulation routines to simulate XL hardware
designs, and to analyze the behavior of the XL components.


XL-Series Development System 

The XL-Series Prototype Development System consists of software and
the XL-Series development board, which plugs into a Compaq Deskpro
386™ personal computer. Programs written for the XL-Series
processors can be run on the board in a UNIX-like environment. The
board uses the personal computer for console and file I/O.


The board has extra connectors and room for wire- 
wrap sockets to allow it to be used as a prototype of 
new designs. 

All XL-Series development software runs on the Compaq Deskpro 386™
under the XENIX/386 operating system.
The rest of this document describes the XL Series from three points
of view: that of the applications programmer, that of the system
programmer, and that of the hardware designer.

RISC ARCHITECTURE

The XL-Series processors are true 32-bit processors that use an
extended RISC (Reduced Instruction Set Computer) architecture. They
have the following in common with other RISC machines:

Register-to-register, three-address instructions. Both integer and
floating point instructions are register-to-register instructions,
where two source registers and one destination register can be
specified in a single instruction.

Load-store architecture. Accompanying the register-to-register
concept is the idea that memory accesses are simple load register or
store register instructions. Memory accesses are broken down into
two instructions: address generation and data transfer. Address
generation and data transfer instructions can be overlapped to
achieve one load per cycle.

Single-cycle execution. All integer instructions except multiply and
divide complete in a single cycle. Floating point instructions
(except for floating point divide) take no more than four cycles.

Pipelined execution. The floating point units are pipelined to allow
a new operation to be started on every cycle.

Large register file. There are 32 general-purpose integer data
registers. This large orthogonal register file allows memory
accesses to be reduced by maintaining variables and passing
parameters in registers instead of memory. The XL-8032 and XL-8064
each have a 32-element floating point register file as well.


MEMORY ARCHITECTURE

Word-oriented architecture. The 32-bit word is the basic data type
of the machine, and all integer instructions produce results of that
size. Manipulation of fields smaller than a word must be done in
registers. Figure 3 shows how memory is addressed and how bytes are
ordered within a word.


PARALLELISM

The functional units in the XL processors operate in parallel,
allowing floating point, integer, memory, and control operations to
occur simultaneously. There are three fields in the instruction
word, called the sequencer field, the integer field, and the
floating point field. The sequencer field is eight bits wide, the
integer field is 24 bits wide, and the floating point field is 32
bits wide (for a total of 64 bits). Most register-to-register
operations use only the integer field, most flow-of-control
instructions use only the sequencer field, and a few instructions
use both fields. Floating point arithmetic instructions occupy the
floating point field. The XL-8000 doesn't have a floating point
field, and so its code word is only 32 bits wide.

In general, any instructions that don't have overlapping fields and
don't cause resource conflicts can execute in parallel (resource
conflicts occur when two instructions try to use the same bus or
register in conflicting ways). Thus a short branch instruction can
occur in the same cycle as a bitwise merge instruction, since the
short branch uses only the sequencer field and the bitwise merge
uses only the integer field. Conditional branches use the condition
code generated by the operation in the integer field to determine
whether to branch or not. This allows the test and the branch to
occur in the same cycle.
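
The field widths described above can be sketched as a small packing
routine. This is an illustrative model only. The position of each
field within the code word is not given in this overview, so the
layout used here (sequencer field in the high bits, then the integer
field, then the floating point field) and the name pack_code_word are
assumptions.

```python
SEQ_BITS, INT_BITS, FP_BITS = 8, 24, 32   # field widths from the text


def pack_code_word(seq, integer, fp=None):
    """Pack the instruction fields into one code word.

    Returns (word, width_in_bits). Passing fp=None models the XL-8000,
    which has no floating point field and a 32-bit code word.
    The field ordering is an assumption made for illustration.
    """
    if not 0 <= seq < (1 << SEQ_BITS):
        raise ValueError("sequencer field must fit in 8 bits")
    if not 0 <= integer < (1 << INT_BITS):
        raise ValueError("integer field must fit in 24 bits")
    word = (seq << INT_BITS) | integer
    if fp is None:
        return word, SEQ_BITS + INT_BITS
    if not 0 <= fp < (1 << FP_BITS):
        raise ValueError("floating point field must fit in 32 bits")
    return (word << FP_BITS) | fp, SEQ_BITS + INT_BITS + FP_BITS
```

Because the three fields never overlap, one operation from each field
can be issued in the same code word, which is the basis of the
parallelism described above.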


bits:           31..24    23..16          15..8            7..0
significance:   Most      Next-to-most    Next-to-least    Least
byte:           11        10              01               00

Figure 3. Data memory addressing and byte ordering 
Registers


DATA REGISTERS 

There are thirty-two 32-bit integer data registers, numbered
.r0-.r31. These are general-purpose data registers, any of which can
be used as the source or destination for integer
register-to-register operations.

FLOATING POINT REGISTERS 

The XL-8032 and XL-8064 have thirty-two floating point data
registers, numbered .f0-.f31. In the XL-8032, these registers are 32
bits wide. In the XL-8064, they are 64 bits wide. The floating point
registers are general-purpose data registers, any of which can be
used as the source or destination for any floating point operation.

PRODUCT REGISTERS 

The two 32-bit product registers, .am and .al, are used by the
multiply, divide, and bitwise merge instructions.

FIELD LENGTH REGISTER

The five-bit field length register is used by the dynamic
bit-manipulation commands (extract, deposit, and merge) to specify
the length of the field to be operated on.

SHIFT AMOUNT REGISTER

Like the field length register, the shift amount register is a
five-bit register used by the dynamic bit-manipulation commands. It
specifies the amount of shifting (0-31 bits) to be applied to the
desired bit field.

CARRY BIT

The carry bit contains the carry from the last arithmetic operation
that generated a carry.

STACK

The processor has a 33-word by 32-bit stack for loop counts, branch
addresses, subroutine return addresses, and data transfers. The
stack consists of a 32-bit top-of-stack register and a 32-word by
32-bit RAM. Overflow and underflow trap handlers allow the stack to
be extended to arbitrary size in data memory.


Instruction Set 

ARITHMETIC FUNCTIONS 

The arithmetic instructions consist of signed and unsigned addition
and subtraction, with and without carry. One of the operands can be
a five-bit immediate instead of a register.

add   .r0,.r1,.r2     # Add .r0 to .r1, store
                      # result to .r2

addi  .r30,9,.r26     # Add a 5-bit immediate
                      # to .r30, store result
                      # to .r26

subc  .r1,.r30,.r12   # Subtract .r30 from .r1
                      # with carry, store result
                      # to .r12


MULTIPLY AND DIVIDE 

A 32-bit signed multiply is performed in eight cycles; a 64/32 bit
mixed-precision unsigned divide is done in twenty cycles.
Multiplication gives a 64-bit product; division gives a 32-bit
quotient and 32-bit remainder.

A multiply or divide can be done in parallel with any operations
(except instructions that use the .am or .al registers).


# Example of Multiplication

mpy   .r1,.r2     # Multiply regs .r1 and .r2.
nop               # Wait for result.
nop               # Useful work could be
nop               # done here instead of
nop               # no-ops.
nop
mov   .al,.r3     # Retrieve low-order 32 bits.
mov   .am,.r4     # High-order 32 bits
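
The arithmetic behind the multiply and divide descriptions can be
modeled in a few lines. This is a sketch of the results only, not of
the timing. It assumes that the 64-bit dividend is held in the
.am/.al pair with .am as the high-order word (the text states only
that divide uses these registers), and the function names are
illustrative.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed32(x):
    """Interpret the low 32 bits of x as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def mpy(a, b):
    """Signed 32x32 multiply. Returns (am, al): high and low product words."""
    product = (to_signed32(a) * to_signed32(b)) & MASK64
    return (product >> 32) & MASK32, product & MASK32


def div(am, al, divisor):
    """Unsigned 64/32 divide. Returns (quotient, remainder) as 32-bit values.

    Assumes the dividend is am:al with am as the high-order word.
    """
    divisor &= MASK32
    if divisor == 0:
        raise ZeroDivisionError("divide by zero")
    dividend = ((am & MASK32) << 32) | (al & MASK32)
    quotient, remainder = divmod(dividend, divisor)
    if quotient > MASK32:
        raise OverflowError("quotient does not fit in 32 bits")
    return quotient, remainder
```

For example, mpy(0x10000, 0x10000) gives a high word of 1 and a low
word of 0, matching the two mov instructions in the listing above.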


FLOATING POINT ARITHMETIC 

The XL-8032 and XL-8064 have floating point arithmetic instructions,
including addition, subtraction, multiplication, integer to floating
point conversion, and floating point to integer conversion. The
XL-8032 uses the divide lookup table instruction (flut) and a
Newton-Raphson approximation to perform floating point division. The
XL-8064 performs floating point division and square root directly in
hardware.

Floating point instructions (except divide and square root) take
three clock cycles to complete on the XL-8032, and two cycles to
complete on the XL-8064.

Floating point operations can be overlapped (pipelined). A new
floating point operation can be started in every cycle, without
waiting for the operations in progress to complete. Loads and stores
to the floating point unit can occur in parallel with floating point
arithmetic.

# Pipelined multiplies on the XL-8032.
# One floating point operation can be
# started in every cycle, and loads
# and stores can occur in parallel

fmul  .f0, .f1, .f2     ; fload .f9
fmul  .f3, .f4, .f5     ; fload .f10
fmul  .f6, .f7, .f8     ; fload .f12
fmul  .f9, .f10, .f11   ; fload .f13
fmul  .f12, .f13, .f14  ; fstore .f2
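
The lookup-plus-Newton-Raphson division mentioned for the XL-8032 can
be illustrated as follows. This is a generic sketch of the technique,
not the XL-8032 routine: the contents and size of the flut table are
not described here, so the 16-entry seed table, the iteration count,
and the function names are assumptions.

```python
import math


def recip_seed(m, table_bits=4):
    """Table-style estimate of 1/m for m in [1, 2).

    Stands in for a divide lookup table; the real table is not
    documented here, so this midpoint scheme is an assumption.
    """
    step = 1.0 / (1 << table_bits)
    index = int((m - 1.0) / step)
    midpoint = 1.0 + (index + 0.5) * step
    return 1.0 / midpoint


def nr_divide(n, d, iterations=3):
    """Compute n / d using a seed estimate refined by Newton-Raphson."""
    if d == 0:
        raise ZeroDivisionError("divide by zero")
    m, e = math.frexp(abs(d))      # abs(d) == m * 2**e, m in [0.5, 1)
    m, e = m * 2.0, e - 1          # rescale so that m is in [1, 2)
    x = recip_seed(m)
    for _ in range(iterations):
        x = x * (2.0 - m * x)      # each step roughly squares the error
    result = n * math.ldexp(x, -e)
    return -result if d < 0 else result
```

Each iteration uses only multiplies and a subtract, which is why the
method suits a pipelined unit that has no divide hardware.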


LOGICAL FUNCTIONS

The processor performs the complete set of sixteen bitwise logical
operations, including and, or, xor, not, nand, etc.


EXTRACT/DEPOSIT OPERATIONS

The processor has a 32-bit field shifter that can perform field
extract, merge, and insert operations in a single cycle. Deposit
takes fields aligned at bit zero and converts them to unaligned
fields; Extract takes unaligned fields and converts them to aligned
fields. Deposit operations fill the bits outside the field with
zeros. Extract operations can either zero-extend or sign-extend the
extracted value. Merge operations merge the bit field into the
target register, leaving bits outside the field unmodified. These
instructions are illustrated in figure 4.

The basic forms of these instructions use immediate values for the
field length and shift amount parameters. Dynamic extract, deposit,
and merge use the values in the shift amount and field length
registers. This gives greater flexibility, but generally takes three
cycles per operation instead of one, since the shift amount and
field length registers must be set up with mov instructions.

# Extract

ext   .r15,3,12,.r2

# Typical dynamic extract

mov   .r0, .sar               # Set shift amount
mov   .r1, .flr               # Set field length
ext   .r15, .sar, .flr, .r3   # Do the extract

Simple left and right shifts are done with the deposit and extract
commands, respectively. Rotates can be done in two cycles with a
combination of two field operations.


PRIORITY ENCODE (FIND FIRST ONE) 

This instruction counts the number of zero bits that 
precede the most-significant one bit in a register. This 
can be used in applications where data is bit-encoded 
in order of priority. 
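
As a rough model, the priority encode result is the number of leading
zeros in a 32-bit word. The sketch below assumes a result of 32 for
an all-zero register, which is not stated in this overview, and the
function name is illustrative.

```python
def priority_encode(word):
    """Count the zero bits above the most-significant one bit.

    Returns 32 for an all-zero word (an assumption made for this model).
    """
    word &= 0xFFFFFFFF
    return 32 - word.bit_length()
```

With requests encoded so that bit 31 has the highest priority, the
returned count identifies the highest-priority pending request
directly.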


[Figure 4: panels "Deposit" and "Merge", showing registers RA and RB, bits 31..0]









© 1988 WEITEK Corporation 
All rights reserved 



















PERFECT EXCHANGE 


This instruction is used to swap fields or reverse the bit order on
2, 4, 8, 16, or 32-bit fields. One use of bit reversal is to
calculate addresses in Fast Fourier Transforms. The perfect exchange
operation is controlled by a 5-bit p field in the instruction. See
figure 5.



Original word (bits 31..0):   first  second  third  fourth

p field   Operation                        Result (bits 31..0)

11111     Reverse all bits in word         fourth third second first,
                                           bits reversed in each byte
11000     Reverse byte order               fourth third second first
10000     Reverse halfword order           third fourth first second
00111     Reverse bits within byte fields  first second third fourth,
                                           bits reversed in each byte
01111     Reverse bits within halfwords    second first fourth third,
                                           bits reversed in each byte

Figure 5. Perfect exchange 
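
One way to read figure 5 is that each bit of the p field enables one
exchange stage: bit k swaps adjacent fields of 2^k bits. That reading
reproduces all five cases shown in the figure, but it is an
interpretation rather than a statement from this overview, and the
function name is illustrative.

```python
# Masks selecting the lower field of each adjacent pair, per stage.
STAGE_MASKS = [0x55555555, 0x33333333, 0x0F0F0F0F, 0x00FF00FF, 0x0000FFFF]


def perfect_exchange(word, p):
    """Apply the exchange stages selected by the 5-bit p field.

    Assumes bit k of p swaps adjacent 2**k-bit fields; this matches
    the five cases in figure 5 but is otherwise an assumption.
    """
    word &= 0xFFFFFFFF
    for k, mask in enumerate(STAGE_MASKS):
        if p & (1 << k):
            width = 1 << k
            word = ((word >> width) & mask) | ((word & mask) << width)
    return word & 0xFFFFFFFF
```

For FFT addressing, an n-bit bit-reversed index can be obtained in
this model by reversing the whole word (p=11111) and then extracting
the top n bits.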


Memory Access Instructions 

ADDRESS GENERATION 

The address generation instructions provide the following addressing
forms: base, base plus displacement, base plus index, and base plus
scaled index. All of these exist in both pre-modified and
post-modified forms. Address generation instructions take a signed
value from an immediate field or register, shift it left by 0-3
bits, and add it to a base register, optionally writing the result
to another register. The address may be either the result of the
addition or the contents of the base register before the addition.

# .byte, .half, and .word correspond to a shift of
# 0, 1, and 2 bits, respectively.

addr   .r20,.r30,.word        # Basic addressing inst:
                              # add and drive the sum
                              # onto the AD bus.

+addr  .r20,.r30,.byte,.r10   # As above, but also
                              # store the sum in .r10

addr+  .r20,.r30,.byte,.r10   # Drive .r20 to AD bus,
                              # then do the add. Store
                              # the sum in .r10.
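
The pre-modified and post-modified forms can be summarized with a
small model. It is a sketch under stated assumptions: registers are
held in a dictionary keyed by register number, only the three scale
names given in the comments above are modeled, and the function name
addr_gen is illustrative.

```python
MASK32 = 0xFFFFFFFF
SCALE_SHIFT = {"byte": 0, "half": 1, "word": 2}


def addr_gen(regs, base, offset, scale="byte", dest=None, post=False):
    """Return the address driven onto the bus for one addressing step.

    regs   -- dict mapping register number to 32-bit value
    base   -- number of the base register
    offset -- signed displacement or index value
    dest   -- optional register number that receives the sum
    post   -- True for the post-modified form (address = old base)
    """
    total = (regs[base] + (offset << SCALE_SHIFT[scale])) & MASK32
    address = regs[base] if post else total
    if dest is not None:
        regs[dest] = total
    return address
```

The post-modified form with dest set to the base register gives the
familiar "use the pointer, then advance it" pattern for walking an
array.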

LOAD 

This instruction places a word from memory into the specified
register. The load must be preceded by an addressing instruction.
Another operation can be done in parallel with load, so long as it
doesn't also try to access memory or the register being loaded.

# Load example

addr  .r14, 0, .word             # Generate the address
subi  15, .r0, .r1 ; load .r23   # Do a subtract,
                                 # while at the same time
                                 # loading data into .r23.

Byte Align For Load. The processor always loads 32-bit words. Align
takes a smaller field, such as a byte, and aligns (and optionally
sign-extends) it to fill the entire word. The two-bit size field
determines the number of bytes to read, and the .adr register is
used to determine alignment. Note that this instruction is a
register-to-register instruction, and doesn't do the actual loading.

# Byte align example

addr   .r14, 0, .byte    # Generate the address
load   .r0               # Load the data
align  .r0, .r1, .byte   # Take a byte from .r0,
                         # align and store to .r1

The addressing instruction and the align instruction must agree on
the size of the data being loaded.
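
The effect of align can be modeled as a shift and mask on the loaded
word. The sketch assumes that the low two bits of the address select
the byte and that byte address 00 is the least-significant byte, as
figure 3 appears to show. The size argument is given in bytes here,
whereas the instruction encodes it in a two-bit field that is not
modeled.

```python
MASK32 = 0xFFFFFFFF


def align(word, adr, size, sign_extend=False):
    """Pick `size` bytes out of a loaded 32-bit word and right-align them.

    adr  -- the address used for the load (low two bits select the byte)
    size -- field size in bytes: 1, 2, or 4
    Assumes byte address 00 is the least-significant byte.
    """
    bits = size * 8
    shift = (adr & 3) * 8
    field = (1 << bits) - 1
    value = (word >> shift) & field
    if sign_extend and value >> (bits - 1):
        value |= MASK32 & ~field
    return value & MASK32
```

If the byte order were the opposite, only the computation of shift
would change; the load itself still fetches the full word.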

Floating Point Loads. Floating point loads work exactly like integer
loads, but the data goes to the register file on the floating point
unit instead of the integer unit.

STORE

The store instruction stores the result of the current processor
operation to memory. A store instruction must be preceded by an
addressing instruction. This instruction always stores a full 32-bit
word.

# Example of store

addr  .r14, 0, .word         # Generate the address
subc  .r1, .r2, .r3 ; store  # Do a subtract,
                             # and store the result to
                             # memory at the address
                             # in register .r14

Byte Align And Store. This instruction takes a single field of 1-4
bytes from a register and stores it to memory, optionally checking
for sign bit overflow.

Floating Point Stores. Floating point stores take a register in the
floating point unit and store it to memory. Floating point stores
can occur in parallel with floating point arithmetic operations.


Program Control Instructions 


Program control instructions come in two formats: short and long.
Short control instructions use the 8-bit sequencer field. Short
instructions include neutralization control, short branches, and
some loop control instructions.

Long control instructions use both the 24-bit integer field and the
8-bit sequencer field. These instructions are used for long
branches, subroutine calls, register transfer, and miscellaneous
operations.

BRANCH INSTRUCTIONS 

All branch instructions are relative to the current instruction.

Br (branch) is a long-format instruction that specifies a 24-bit
displacement relative to the current instruction.

Shbr (short branch) is a short-format instruction that branches
forwards or backwards in the range of -16 to +15 instructions. An
integer operation can be performed in parallel with the branch.

Brc (conditional branch) is a short-format instruction that branches
conditionally if the parallel integer operation satisfies the test
condition. Its range is 0-31 instructions.

Fbr (floating point branch) is like brc, but uses the floating point
condition to determine whether to branch or not.

Brstkp (branch to stack and pop) branches by the displacement on the
top of stack, then discards the top-of-stack value. Brstkp is a
short-format instruction.

CALL AND RETURN INSTRUCTIONS 

Bsr (branch to subroutine) pushes the address of the next
instruction on the stack and performs a signed 28-bit-displacement
branch. It is a long-format instruction.

Bsrstk (bsr to stack) is used for dynamic subroutine calls. It
branches to the absolute address on the top of stack. The address of
the next instruction replaces the value on the top of stack. It is a
short-format instruction.

Rts (return from subroutine) uses the top of stack as an absolute
jump address. The value is then popped off the stack. It is a
short-format instruction.

LOOPING INSTRUCTIONS 

Loop pushes the next instruction address onto the top 
of stack. It is a short-format instruction. 

Endloop is a short-format instruction that branches to 
the address on the top of stack if the parallel integer 
operation satisfies the test condition. Otherwise, the 
address is popped off the stack and the loop is exited. 
The address is typically put there by a loop instruction. 

Fndloop is like endloop, but uses the floating point 
condition to determine whether to branch or not. 
Sob (subtract one and branch) subtracts one from the top of stack.
If the result is non-zero, a branch is made by the displacement
given in a 24-bit sign-extended immediate value. If the result is
zero, the stack is popped, and normal sequential execution resumes.
This is useful at bottoms of loops designed to continue a set number
of iterations. It is a long-format instruction.

Shsob (short sob) is similar to sob, but branches are specified with
a one-extended (i.e., negative) five-bit immediate, rather than a
sign-extended 24-bit immediate. It is a short-format instruction.

Brp (branch and pop) branches by a sign-extended 24-bit immediate
displacement. The value on the top of stack is popped off and
discarded. Used to exit loops prematurely. Brp is a long-format
instruction.
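
The counted-loop behavior of sob can be traced with a short model.
It assumes the loop count has already been placed on the top of
stack before the loop body is entered. The function name and the use
of a Python list for the stack are illustrative only.

```python
def run_counted_loop(count, body):
    """Run `body` under sob-style loop control.

    The count sits on the top of stack. At the bottom of each pass,
    sob subtracts one: a non-zero result branches back to the top of
    the loop, a zero result pops the count and falls through.
    Returns (iterations_executed, remaining_stack).
    """
    if count < 1:
        raise ValueError("this model expects a count of at least 1")
    stack = [count]
    iterations = 0
    while True:
        body(iterations)
        iterations += 1
        stack[-1] -= 1            # sob: subtract one from top of stack
        if stack[-1] != 0:
            continue              # result non-zero: branch back
        stack.pop()               # result zero: pop, resume sequentially
        break
    return iterations, stack
```

Because the test is made at the bottom of the loop, the body always
runs at least once, and an early exit must discard the count itself,
which is the purpose of brp.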

NEUTRALIZATION 

One reason for the processor's high speed is that it fetches the
next instruction at the same time it executes the current
instruction. This means that the next instruction has already been
fetched when it becomes time to execute it.

When a branch is executed, however, the processor has the
instruction following the branch in its instruction pipeline, not
the instruction at the destination address. The instruction that has
been fetched is called the "shadow instruction." Fetching the
correct instruction takes an additional cycle (since it's not yet in
the pipeline) so the destination instruction is executed after a
one-cycle delay. This is called "delayed branching."

The processor normally neutralizes the cycle following taken branch,
call, and return instructions. Neutralization effectively turns an
instruction into a no-op.

The processor instruction set also provides three additional
instructions: override neutralization, override neutralization and
increment stack pointer, and reverse neutralization (ovneut,
ovneuti, and revneut). Efficient code, such as that produced by the
XL compilers, makes use of these instructions to selectively execute
shadow instructions, saving one clock cycle per branch.

TRAPI INSTRUCTION 

The trapi instruction is used with an 11-bit immediate 
to make system calls. The immediate value is pushed 
onto the stack, and a software interrupt is generated. 
The interrupt service routine uses the value on the 
stack to determine which system call is required. This 
allows user-mode routines to request supervisor-mode 
services in an orderly manner. 


Parallelism 

Since there are three fields in the instruction word, a maximum of
three operations may be specified for any one instruction cycle:
typically an integer operation, a control operation, and a floating
point operation. All three will execute in parallel.

It's possible to have more than three operations in progress during
a cycle, however, since the integer multiply/divide unit will work
in parallel with other instructions, and several instructions can be
in the pipeline of the floating point unit.

# Example of parallelism on the XL-8032

mpy   .r1,.r2        # Start an integer multiply
fmul  .f0,.f1,.f2    # Start a floating point mpy
fadd  .f8,.f9,.f7    # Start a floating point add

# Now start three operations at once:

add   .r25,6,.r2 ; bne LABEL ; fsub .f11,.f12,.f13


In the last cycle of this example, there are six operations going on
at once: three floating point operations (a multiply, which finishes
in the next cycle, an add, which will finish one cycle later, and a
subtract, which will finish two cycles later), an integer multiply
(executing in the independent multiply/divide unit), an integer add,
and a conditional branch (which tests the result of the integer
add).
The systems programmer is in control of interrupts, system
initialization, privileged instructions, and debugging. The
XL-Series processors have a number of features that make systems
programming easier, including an extra bank of data registers, a
trap that signals imminent stack overflow or underflow, vectored
interrupts, individual and master interrupt enables, and a
breakpoint/watchpoint register.

SUPERVISOR MODE 

The processor enters supervisor mode on reset and when it honors an
interrupt. A number of instructions are accessible only from
supervisor mode, including most instructions that modify special
registers directly (user-mode instructions are also available in
supervisor mode). User programs (which in this case means any code
that doesn't run in supervisor mode) are expected to request system
services by using trapi instructions, which perform software
interrupts and thus put the processor into supervisor mode. On the
return from the trap handler, supervisor mode is restored to its
previous state.

A privilege violation trap occurs when a user-mode program attempts
to execute a supervisor-mode instruction.


Registers 

PROGRAM COUNTERS 

There are two 32-bit program counters: the currently executing
address (.cea), the true program counter; and the currently fetching
address (.cfa), which is the address from which a code word is being
fetched for later execution. Branches are taken relative to the
currently executing address.

INTERRUPT ADDRESS REGISTERS 

There are four interrupt address registers: the interrupt base
register (.ibr), the interrupt last address (.ila), the interrupt
fetch address register (.ifa) and the interrupt execute address
register (.iea). The interrupt base register contains the address of
the interrupt vector table, which is a 64-word table in code memory.
The interrupt last address contains the address of the last
instruction that was successfully executed. The interrupt fetch and
execute registers hold the old contents of .cfa and .cea,
respectively, during interrupt processing.

ADDRESS HOLDING REGISTER

The processor retains the last address generated by any of the
address generation instructions in the .adr register, so it can
re-assert the address in the event of an interrupt.

SECOND REGISTER BANK 

Registers .r28-.r31 are duplicated in a second bank, which is
swapped in and out when the z bit in the processor status register
is toggled. These registers (named .r28'-.r31') are used to save the
state of the machine during interrupt processing. The second bank is
illustrated in figure 6. Note that .r0-.r27 are always accessible,
regardless of the state of the z bit.

STACK POINTER 

The five-bit stack pointer (.tos) is a modulo 32 counter 
which increments on each push and decrements on 
each pop. 

An underflow exception occurs when a pop operation 
empties the stack. An overflow exception is generated 
when a push operation nearly fills the stack. 


A pair of exception routines can implement a larger 
stack in system memory. When the processor stack 
overflows, it can be copied to the main memory stack; 
when it underflows, data in the memory stack can be 
restored to the processor stack. 
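
The counter behaviour described above can be sketched in a few lines of Python. This is only an illustrative model: the class and method names are hypothetical, and the all-ones reset value is the "empty stack" value the text gives for the .tos field after reset. It does not model the overflow and underflow exceptions.

```python
class StackPointer:
    """Illustrative model of the five-bit, modulo-32 .tos counter."""

    RESET_VALUE = 0b11111  # all ones after reset, meaning "empty stack"

    def __init__(self):
        self.tos = self.RESET_VALUE

    def push(self):
        # Increment on each push, wrapping modulo 32.
        self.tos = (self.tos + 1) & 0x1F
        return self.tos

    def pop(self):
        # Decrement on each pop, wrapping modulo 32.
        self.tos = (self.tos - 1) & 0x1F
        return self.tos
```

In this model the first push after reset wraps the counter from 31 to 0, and a matching pop returns it to 31.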

PROCESSOR STATUS REGISTER 

The processor retains some control information in the
processor status register (.psr). Important fields in the
processor status register are the carry bit (c); the field
length register (.flr), which is used in bit-field-manipulation
instructions; the shift amount register (.sar),
which is also used in field-manipulation instructions;
and the register bank toggle (z). The processor status
register is shown in figures 7 and 8.

SEQUENCER STATUS REGISTER 

The sequencer status register (.ssr) includes the five-bit
top-of-stack register (.tos), the supervisor mode
and branch bits, ten sets of flag/enable bits which control
and identify the state of interrupts and exceptions,
and the master interrupt enable bit. If the master enable
bit is cleared, all interrupts are disabled.

Instructions that explicitly read or write the sequencer
status register are restricted to code running in supervisor
mode. The sequencer status register is illustrated in
figures 9 and 10.

The flag/enable bits selectively control the interrupts. If
an interrupt signal is asserted, or an internal exception
occurs, its flag bit is set. If the relevant enable bit is set
(and the master interrupt enable bit is also set), then
an interrupt sequence also occurs. Interrupt-handling
software can examine the flag bits to determine which
interrupts have occurred.

The men bit is the master interrupt enable. If men is 
false and an interrupt occurs, no interrupt routine will 
be called, but the associated exception flag will still be 
set. 

After a reset, .ssr is initialized with all zeroes except 
for the supervisor mode bit, which is set; and the .tos 
field, which is set to all ones to indicate an empty 
stack. 


Bit 31 (leftmost field) down to bit 0 (rightmost field):

Field          reserved   be   z   c   .sar   .flr
Width (bits)      19       1   1   1     5      5


Figure 7. Processor status register (.psr) 


Symbol     Meaning
.sar       Shift amount register
.flr       Field length register
z          Register bank select (for .r28-.r31)
c          Carry bit
be         Reserved (should be set to zero)
reserved   Reserved (should be set to zero)


Figure 8. Processor status register bit fields 
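
As a rough illustration of how software might unpack the processor status register, the sketch below extracts the four documented fields. The bit positions are an assumption read off the field order and widths in figure 7 (field length register in bits 0-4, shift amount register in bits 5-9, carry in bit 10, bank select in bit 11); the function name and dictionary keys are hypothetical.

```python
def decode_psr(psr):
    """Split a 32-bit .psr value into its documented fields.

    Bit positions are assumed from the layout of figure 7.
    """
    return {
        "field_length": psr & 0x1F,         # assumed bits 0-4
        "shift_amount": (psr >> 5) & 0x1F,  # assumed bits 5-9
        "carry": (psr >> 10) & 1,           # assumed bit 10
        "bank_select": (psr >> 11) & 1,     # assumed bit 11 (z)
    }
```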


Bit 31 (leftmost field) down to bit 0 (rightmost field):

Field          .tos  m  ext  pag  flt  prv  trp  sun  sov  tim  brk  s  b  mal
Width (bits)     5   1   2    3    2    2    2    2    2    2    3   2  2   2



Figure 9. Sequencer status register 
Symbol  Bit #  Name     Meaning
mal     0      malflg   flag for misaligned data interrupt
        1      malen    enable for misaligned data interrupt
b       2      b        reserved. State must be preserved by the programmer
        3      bi       reserved. State must be preserved by the programmer
s       4      s        reserved. State must be preserved by the programmer
        5      si       reserved. State must be preserved by the programmer
brk     6      brkflg   flag for breakpoint interrupt
        7      brkenc   enable for code breakpoint interrupt
        8      brkend   enable for data breakpoint interrupt
tim     9      timflg   flag for timer interrupt
        10     timen    enable for timer interrupt
sov     11     sovflg   flag for sequencer stack overflow interrupt
        12     soven    enable for sequencer stack overflow interrupt
sun     13     sunflg   flag for sequencer stack underflow interrupt
        14     sunen    enable for sequencer stack underflow interrupt
trp     15     trpflg   flag for trap instruction interrupt
        16     trpen    enable for trap instruction interrupt
prv     17     prvflg   flag for privileged instruction interrupt
        18     prven    enable for privileged instruction interrupt
flt     19     ext4flg  flag for EXT4- interrupt
        20     ext4en   enable for EXT4- interrupt
pag     21     ext23en  enable for EXT2- and EXT3- interrupts
        22     ext2flg  flag for EXT2- interrupt
        23     ext3flg  flag for EXT3- interrupt
ext     24     ext1flg  flag for external interrupt EXT1-
        25     ext1en   enable for EXT1-
m       26     men      master interrupt enable
.tos    27-31  tos      five-bit top of stack pointer


Figure 10. Sequencer Status Register bit fields 


TIMER REGISTER AND INTERRUPT BREAKPOINT/WATCHPOINT REGISTER 


The processor includes a 32-bit timer register. This
register is decremented every clock cycle. When the value
becomes negative, the timer flag is set and a timer
interrupt occurs.

The timer continues to decrement when negative,
allowing accurate timing even if the service routine is
interrupted or delayed. The timer may only be set or
read from supervisor mode.
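
The reason a free-running negative count helps can be seen in a small model. The sketch below is an assumption-laden illustration: the `Timer` class and its method names are hypothetical, the register is treated as a 32-bit two's-complement value, and the flag is modelled as sticky once the value has gone negative.

```python
class Timer:
    """Illustrative model of the 32-bit decrementing timer register."""

    def __init__(self, initial):
        self.value = initial & 0xFFFFFFFF
        self.flag = False

    def tick(self):
        # One clock cycle: decrement, wrapping as a 32-bit register.
        self.value = (self.value - 1) & 0xFFFFFFFF
        if self.value & 0x80000000:  # sign bit set: value is negative
            self.flag = True

    def signed(self):
        # Interpret the register as a two's-complement number.
        if self.value & 0x80000000:
            return self.value - (1 << 32)
        return self.value

    def cycles_overdue(self):
        # Cycles elapsed since the value first became negative.
        s = self.signed()
        return -s - 1 if s < 0 else 0
```

In this model a timer loaded with 3 raises its flag on the fourth tick; a service routine that starts two cycles late can read the register and see that it is two cycles overdue.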


The 32-bit Breakpoint Register (.brk) is used to provide
a code breakpoint or data watchpoint for program
debugging. Breakpoints and watchpoints are set by
loading the register with the address to be monitored,
and enabling the breakpoint or watchpoint interrupt
enable bit. When the processor accesses the location
being monitored, a breakpoint interrupt will occur.


Interrupts 

The processor receives interrupts from four external
sources: EXT1-, EXT2-, EXT3-, and EXT4-; and
generates seven interrupts internally: BRK, TIM, SOV,
SUN, TRP, MAL, and PRV. When an interrupt control
line or internal condition becomes active, the sequencer
sets the corresponding .ssr interrupt flag. If the
master interrupt enable bit of the .ssr is set, and the
corresponding .ssr interrupt enable is active, the
interrupt will be honored, as described below.

There are sixteen interrupt vector addresses, divided
into four classes: EXT1-, EXT2- or EXT3-, TRAP, and
Others. There is an interrupt vector for every combination
of the four classes.

INTERRUPT CONTROL LINES 


The external interrupt sources are: EXT1-, EXT2-,
EXT3- and EXT4-. Each has status and interrupt enable
bits in the .ssr.

The interrupt mask bit for EXT1- is called ext1en, and
its status bit is ext1flg.

EXT2- shares an interrupt enable bit, ext23en, with
EXT3-. Its associated status bit is ext2flg.

EXT3- shares an interrupt enable bit, ext23en, with
EXT2-. Its status bit is ext3flg.

EXT4- has an enable bit called ext4en. Its status bit is
ext4flg. It is used as the floating point exception
interrupt on the XL-8032 and XL-8064.

EXCEPTION SOURCES 

There are seven internal exception sources: PRV, SOV,
SUN, MAL, TRP, TIM and BRK. Each has a status and
interrupt enable bit in the .ssr.

PRV is set when an attempt is made to execute a
privileged instruction while not in supervisor mode. Its
interrupt enable and status bits are prven and prvflg,
respectively.

SOV and SUN indicate stack near-overflow and
near-underflow. SOV occurs when data is pushed into the
third-to-last available word on the stack. SUN occurs
when the next-to-last valid data is popped off the
stack. The enable and status bits for SOV and SUN are
soven, sovflg, sunen, and sunflg.

MAL is the misaligned data exception, which occurs
when the data to be loaded or stored straddles a word
boundary. Its enable and status bits are malen and
malflg, respectively. Misaligned loads and stores can
be corrected in a trap handler if code memory can be
examined by the software. This is possible in some
implementations. Otherwise, misaligned loads and stores
are unrecoverable errors.
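
The "straddles a word boundary" condition can be made concrete with a short check. This sketch assumes a 32-bit (four-byte) data word and byte addressing, as on the XL-8000 and XL-8032; the function name is hypothetical and the hardware's exact detection logic is not described in this overview.

```python
WORD_BYTES = 4  # assumed 32-bit data word


def straddles_word_boundary(byte_address, size_bytes):
    """True if an access of size_bytes at byte_address spans two words."""
    first_word = byte_address // WORD_BYTES
    last_word = (byte_address + size_bytes - 1) // WORD_BYTES
    return first_word != last_word
```

Under these assumptions a halfword at byte offset 2 stays inside one word, while a halfword at offset 3 or a full word at offset 1 would be a misaligned access.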

TRP is set by invoking one of the trap instructions. Its
enable and status bits are trpen and trpflg,
respectively. Trap instructions are software interrupts.

The remaining two exceptions, TIM and BRK, are set on
timer interrupts and breakpoints/watchpoints,
respectively. Their enable and status bits are timen, timflg,
brkenc, brkend (for code and data breakpoints,
respectively), and brkflg.
PROCESSOR CHIP SETS

All XL-Series processors have an integer processing
unit (IPU) and a program sequencing unit (PSU).
Each of these units is a single CMOS device, in a
144-pin PGA package.

The XL-8032 has a 32-bit floating point unit, also in a
144-pin PGA package. The XL-8064 has a floating
point unit which comes in a 168-pin PGA package.



Figure 11. Block diagram of an XL-8000 system 



Figure 12. Block diagram of an XL-8032 (32-bit FPU) system or an XL-8164 (64-bit FPU, 32-bit bus) system. 
Figure 13. Block diagram of an XL-8364 system (64-bit FPU with a 64-bit data bus)


Signal Description, Buses


There are two memory buses on the XL processors:
one for code and one for data. Having separate code
and data space gives the same bus bandwidth as a
conventional Von Neumann machine running at twice the
clock rate. Another bus, the OP bus, is used both for
external I/O and to encode information about the
machine state that is useful to the data memory and
interrupt systems.

Both code and data buses have 32-bit addresses. The
width of the data bus varies among the XL-Series
processors. The code word of the XL-8000 is 32 bits wide.
The XL-8032 and XL-8064 have 64-bit code words to
make room for the floating point instruction field. The
XL-8000 and XL-8032 have 32-bit data words, while
the XL-8064 has configurations for either a 32- or
64-bit data word (the XL-8164 and the XL-8364
configurations). The XL-8364 configuration allows
double-precision floating point words to be loaded or stored in
a single bus transaction.

AC BUS 

The AC31..0 Code Address Bus is driven by the
sequencer. It sends a 32-bit instruction address to the
code memory. The code address is not latched by the
sequencer, so an external address latch is necessary
between the AC bus and code memory. The sequencer is
the only XL component that uses the AC bus. The
high-order bits of the AC bus can be left floating if they
aren't going to be used.

The AC bus produces code word addresses, not byte 
addresses. The code word size is 32 bits on the 
XL-8000 and 64 bits on the XL-8032 and XL-8064. 

AD BUS 

The AD31..0 Data Address Bus provides addresses for
data memory operations. It is also used for intra-processor
communication and communication with external
hardware. It connects to the integer processor and
sequencer, but not to the floating point processor. It can
also be used as a bidirectional data bus for transfers to
and from other hardware. The AD bus is not latched,
so an external address register is necessary between the
AD bus and data memory.

All 32 bits of the AD bus need to be attached between
the sequencer and integer processor, but the high-order
bits can be ignored by external memory if the full
32 bits of memory space isn't going to be used.

The address on the AD bus is a byte address. Since the
integer processor and floating point unit always load
full words, and the write-enable bits (WREN-) indicate
which bytes are to be written, it isn't necessary to
connect AD1..0 to memory or peripherals.
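
To see why the two low address bits are redundant for a word-wide memory, the byte address can be split into a word address and a byte offset, with the offset determining which byte lanes a store touches. The sketch below is illustrative only: the function names are hypothetical, a 32-bit data word is assumed, and the mapping from a byte lane number to a particular WREN- pin is not stated in this overview and is left open.

```python
def split_byte_address(addr):
    """Return (word address, byte offset) for a 32-bit-wide memory."""
    return addr >> 2, addr & 0b11  # AD31..2 and AD1..0


def byte_lanes(addr, size_bytes):
    """List the byte offsets within the word that a store would write."""
    _, offset = split_byte_address(addr)
    if offset + size_bytes > 4:
        raise ValueError("access straddles a word boundary")
    return [offset + i for i in range(size_bytes)]
```

A halfword store to byte address 0x1006, for instance, selects word 0x401 and writes the two bytes at offsets 2 and 3, so the memory needs only the word address plus the per-byte write enables.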

C BUS 

The C31..0 or C63..0 Code bus is driven by the code
memory with the 32- or 64-bit instruction word. The
code word is latched by the processor at the rising edge
of the clock. This bus provides instruction words for all
components in the XL-Series processor.

D BUS 

The D31..0 or D63..0 Data bus is used as a
bidirectional input/output bus. It transfers data words
between data memory and the processor. The integer
processing unit always loads 32-bit words, but can store
bytes or halfwords. The data is latched by the
processor. This bus is connected to the floating point
processor and integer processor, but not to the sequencer.

The 64-bit configuration of the D bus can be used by
the XL-8364 floating point processor to allow
double-precision floating point words to be loaded or stored in
one bus transaction. The integer processor accesses
memory 32 bits at a time; only the floating point
processor can do 64-bit transfers.

OP BUS 

The OP4..0 output bus indicates the type of instruction
that is executing, and can be used to control external
hardware. The memory system decodes the OP bus
outputs to determine when to read, when to write, and
when to latch the data address. In addition, fifteen of
the 32 OP combinations are used to signal loads or
stores to "external registers" 0-14, which can be any
external hardware. These external register transfers
take place over the AD bus. The OP bus is on the
XL-8136 sequencer.
Signal Description, Bus Control

ABORT- 

ABORT- is a "not-ready" line for data memory. It is
asserted by the data memory subsystem when the data
at the requested address cannot be accessed on the
next cycle. The XL components each cancel both their
current and next instructions, and attempt to re-start
the current instruction. The instruction will be
re-executed the cycle after ABORT- is de-asserted. All the
XL-Series components must have their ABORT- lines
tied together.

STALL- 

STALL- is a "not-ready" line for code memory. It is
asserted by the code memory subsystem when the
requested code word cannot be read in the current cycle.

The XL components each cancel their currently fetching
instruction, and attempt to fetch it again on the
next cycle. The instruction will be executed when
STALL- is de-asserted. All the XL-Series components
must have their STALL- lines tied together.

OEA-, OEAD-, OED-, OEX-, AND OEAC-

OEAD-, OED-, and OEAC- are asynchronous output
enable signals for the AD, D, and AC buses respectively.
The buses are tri-stated when disabled. OEX- is
the XL-3132's equivalent for OED-. OEA- is the
XL-8137's equivalent for OED-. These signals allow
easy access to the code and data buses for cycle-stealing
or DMA hardware.

WREN- 

The WREN3..0 signals are write-enables for each byte
in the data word. The WREN- lines are driven when a
store instruction is executed by the processor.


Other Signals 

NEUT- 

NEUT- is a signal that goes from the PSU to the IPU
and FPU. It is not normally used by hardware outside
the processor chip set. NEUT- is asserted by the
sequencer, and instructs all XL components to cancel
their current instructions. This is done on branches,
calls, and interrupts to prevent the instruction in the
pipeline from being executed. All XL-Series components
must have their NEUT- lines tied together.

EXT1-, EXT2-, EXT3-, and EXT4- 

Level-sensitive interrupt request lines. The current
instruction is allowed to complete (unless ABORT- is also
asserted, in which case the instruction is canceled, and
will be re-executed when the interrupt routine returns),
and execution proceeds from one of the interrupt
vectors. External interrupts can be enabled and disabled
in the sequencer status register. Interrupt signals are
examined only at the rising edge of the clock.

EXT4- is used by the XL-8032 and XL-8064 as a floating
point exception interrupt.

COND 

Condition code signal. Goes from the IPU to the PSU. 
Not normally used outside the processor chip set. 


FPCN 

Floating point condition code. Goes from the FPU to 
the PSU. Not normally used outside the processor chip 
set. Tied to GND in the XL-8000, which does not have 
a floating point unit. 

ZERO 

XL-3132 zero condition output. Not used in XL. 
(Leave floating.) 

CLK 

The Clock signal, CLK, is a single-phase TTL-level 
clock signal. 

MDCLK 

The multiply/divide clock signal, MDCLK, is a single-phase
TTL-level clock signal. This signal must be synchronized
to the rising and falling edges of the CLK
signal, and runs at twice the frequency of CLK.

SUP

An output that indicates that the processor is in supervisor
mode. Can be used to implement protected memory.


RESET- 

A level-sensitive input that resets the sequencer and
causes a branch to address 0. The stack pointer is
initialized to 31 (empty stack), and supervisor mode is
set. The other registers in the sequencer are undefined.
Reset is not useful as a non-maskable interrupt.

After the power-up reset, only the program counter
and the sequencer status register are defined. All other
register contents are undefined. Registers that can
cause exceptions (such as the timer and breakpoint
registers) must be initialized before their exceptions are
enabled.


OEA+

A signal on the XL-8137 that causes the ALU result to
be driven onto the AD bus on every cycle. Tied low in
all XL configurations. Note that this is not the same
signal as OEA-.

VCC AND GND

VCC is a +5.0 volt supply. GND is a system ground. All
VCC and GND pins must be connected; floating pins
are not allowed.

NC

No connection. Reserved for future expansion.
A simple code memory system is shown in figure 14.
The code address comes out on the AC bus, is latched by
a set of 373-type latches, and fed into a 32- or 64-bit-wide
array of ROMs or static RAMs. The output of the
memory devices is driven onto the C bus.

More complex memory systems (such as cached
DRAM or static column DRAM) won't always have
code ready at the end of a cycle, so the STALL- line
has been provided for memory handshaking. Asserting
STALL- will cause the code fetch cycle to be retried on
the next clock cycle. STALL- can be asserted for as
many cycles as necessary to retrieve the code word.

Memory faults such as accesses to non-existent memory
or virtual memory page faults can be corrected by
asserting STALL- and an interrupt at the same time.
The interrupt takes precedence over the stall, so the
interrupt routine can take corrective action and return,
and the stalled instruction will be tried again.



Figure 14. Simple code memory system 


Data Memory System 


A simple data memory system is shown in figure 15.
Data addresses are driven onto the AD bus by the
processor, latched by a set of 374-type registers, and fed
into a 32-bit-wide array of static RAMs. The output of
the RAMs is driven onto the D bus. More complex
memory systems include DRAMs with a static RAM
cache, and multiple banks of DRAM.

The OP bus is decoded to determine what operations
are taking place during the cycle. An address operation,
a read, a write, or an address operation plus a
read can take place during a single cycle. The decoded
OP bus is used to drive the read/write and output
enable lines on the data RAMs, and as an input to the
clock generation logic, which needs to refrain from
clocking the external address register under certain
conditions.

If the memory system is not going to be ready in time,
ABORT- is used as a data memory handshake signal.
ABORT- can be held for any length of time.

Unlike STALL-, ABORT- causes the system to back up
and re-execute the aborted instruction when it is
released. This allows the failed bus transfer to be
re-executed. The external address register should not be
clocked during an ABORT- sequence.





Figure 15. Simple data memory system 
OP Output Bus

The OP Bus is available to control external registers
and transceivers. Its encoding reports the following
actions: address generation, interrupt acknowledge,
return from interrupt, data load, data store, and I/O
through external registers 0-14.


I/O System 

Data can be transferred to external devices over the
AD bus at the rate of one 32-bit word per clock cycle.
Such transfers are signaled with the OP bus codes
10000-11110, which select "external registers" 0-14.
The OP bus code is used to select the external device.
Each external register should be associated with one
data direction, since there is no data direction signal
for OP bus transfers. External I/O is performed in
assembly language with the input and output instructions,
which transfer one of the 32 data registers over the AD
bus.
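
External decode logic for these transfers amounts to a range check on the five OP bits. The sketch below models that check in software; the function and constant names are hypothetical, and the linear mapping (code 10000 selects register 0, code 11110 selects register 14) is an assumption inferred from the ranges quoted above rather than taken from the XL-8136 encoding table.

```python
EXT_REG_FIRST = 0b10000  # OP code assumed to select external register 0
EXT_REG_LAST = 0b11110   # OP code assumed to select external register 14


def external_register(op_code):
    """Return the external register number for an OP code, or None."""
    if not 0 <= op_code <= 0b11111:
        raise ValueError("OP bus codes are five bits wide")
    if EXT_REG_FIRST <= op_code <= EXT_REG_LAST:
        return op_code - EXT_REG_FIRST
    return None  # some other OP bus operation
```

The range covers exactly fifteen codes; code 11111 and every code below 10000 fall outside it.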

I/O can also be memory-mapped. External DMA can
be performed by tri-stating the buses and asserting
ABORT-, at which point data memory can be taken
over for any length of time. External code memory access
can be performed by asserting STALL- and
OEAC-.


Interrupt System 

An external interrupt will only be acknowledged if both
the master interrupt enable bit and the individual
interrupt enable bit are set on the cycle the interrupt is
asserted. Interrupts are level-sensitive and synchronous,
and are read at the rising edge of the clock.
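
The acknowledge condition can be written as a small predicate over the sequencer status register. This is a sketch under stated assumptions: the bit numbers are those listed in figure 10 (with EXT2- and EXT3- sharing one enable bit), while the table and function names are hypothetical.

```python
MEN_BIT = 26  # master interrupt enable

# source name -> (flag bit, enable bit), per figure 10
EXTERNAL_SOURCES = {
    "EXT1": (24, 25),
    "EXT2": (22, 21),  # shares its enable bit with EXT3
    "EXT3": (23, 21),
    "EXT4": (19, 20),
}


def acknowledged_external_interrupts(ssr):
    """List the external sources whose flag, enable and men bits are all set."""
    if not (ssr >> MEN_BIT) & 1:
        return []  # master enable clear: flags may be set, nothing is honored
    return [
        name
        for name, (flag, enable) in EXTERNAL_SOURCES.items()
        if (ssr >> flag) & 1 and (ssr >> enable) & 1
    ]
```

With men and the shared EXT2/EXT3 enable set, a flagged EXT2- is honored while a flagged EXT1- with its enable clear is not; clearing men suppresses both.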

INTERRUPT SEQUENCE 

When an interrupt is detected, the processor decides
whether to allow the current instruction to proceed.
This decision is based upon the state of the ABORT-
signal.

The current instruction is allowed to complete if the
ABORT- signal is not asserted; it is canceled otherwise.
This allows the system to re-execute the current
instruction after returning from the interrupt.

The processor then enters supervisor mode, neutralizes
the fetched instruction, saves state information in .iea,
.ifa, and .ssr, and branches to the interrupt vector
address.

Interrupts can be nested to any depth by saving the
processor state and re-enabling interrupts.

To return from an interrupt, two special interrupt return
instructions must be executed: return-from-interrupt-0
(rfi0) and return-from-interrupt-1 (rfi1).


Power-up and Initialization 

On power-up, the state of the processor is undefined.
RESET- should be held while the system is powered
up, then released. RESET- is a level-sensitive, synchronous
signal that is sampled on the rising edge of the
clock. When RESET- is asserted, a branch to absolute
address zero occurs.


©1988 WEITEK Corporation 
All rights reserved 




PRELIMINARY DATA 

April 1988

Timing 

The figures below give cycle-by-cycle timing for typical
bus operations. Both code and data memory systems
use overlapped address generation and loads. This
memory pipeline allows the system to use slower RAMs
without a performance penalty. Loads on both memory
systems occur at the rate of one per cycle. Stores on
the data memory system can occur at the rate of one
every two cycles. I/O transfers using the input and output
instructions can take place every cycle.



OP bus values are encoded to show an external register 
address or a combination of address generation, load/ 
store, interrupt acknowledge, and similar operations. 
There is also a default encoding that occurs when none 
of the other conditions apply. These conditions are 
shown in the OP bus entries in the timing diagrams. 
The OP bus bit encodings are given in the XL-8136 
data sheet. 
Figure 17. Data memory timing (loads). "ADDRESS REGISTER" is the external data address register

Figure 18. Data memory timing (stores). "ADDRESS REGISTER" is the external data address register

Figure 19. External I/O timing (for data transfers over the AD bus)



Figure 20. Interrupt sequence. (“INST” identifies which instruction is being executed.) 
Instruction Set Summary


XL-SERIES CONTROL INSTRUCTIONS 


br        Branch
brp       Branch and Pop
brstkp    Branch to Stack and Pop
bsr       Branch to Subroutine
bsrstk    Branch to Subroutine from Stack
cont      Continue
endloop   Conditional End of Loop
loop      Enter Loop
ovneut    Override Neutralization
ovneuti   Override Neut. and Increment Stack
pops      Pop from Sequencer Stack
pushs     Push onto Sequencer Stack
revneut   Reverse Neutralization
rfi0      Return from Interrupt 0
rfi1      Return from Interrupt 1
rts       Return from Subroutine
seq       Sequencer Housekeeping Instruction
shbr      Branch (short form)
shsob     Sob (short form)
sob       Subtract One and Branch
trap      Trap
trapb     Trap and Back Up
trapi     Trap Immediate


XL-SERIES INTEGER INSTRUCTIONS 


add       Signed Add
adda      Add Address
addai     Add Address Immediate
addam     Add Register .am
addami    Add Register .am Immediate
addamis   Add .am Immediate plus Sign
addc      Signed Add with Carry
addi      Signed Add Immediate
addi10    Signed Add 10-bit Immediate
addr      Generate Address (no increment)
addr+     Indexed Addressing (post-increment)
+addr     Indexed Addressing (pre-increment)
addrd     Generate Address with Displacement
addshft   Shift and Add
align     Byte Align for Load
and       Logical AND
asrtadr   Put Address Register on AD Bus
bmerge    Bitwise Merge
clr       Clear
dep       Deposit
div       Divide
ext       Extract
ffo       Find First One (Priority Encode)
input     Input from AD Bus
ldamal    Load Multiply Result Registers
load      Load from Memory
mer       Merge Bit Fields
meri      Merge Immediate
mov       Move
movi      Move Halfword Immediate
movih     Move Immediate High
mpy       Multiply
nand      Logical NAND
neg       Signed Negate
nop       No Operation
nor       Logical NOR
not       Logical NOT
or        Logical OR
output    Output to AD Bus
pexch     Perfect Exchange
rapsr     Restore .adr and .psr
salign    Signed Byte Align
set       Set to Ones
setsema   Set Semaphore
srpsr     Save and Restore .psr
store     Store to Memory
sstore    Signed Store to Memory
sub       Signed Subtract
suba      Subtract Address
subc      Signed Subtract with Carry
subai     Subtract Address Immediate
subi      Signed Subtract Immediate
swap      Save .psr and Swap Banks
uadd      Unsigned Add
uaddc     Unsigned Add with Carry
usub      Unsigned Subtract
usubc     Unsigned Subtract with Carry
xnor      Logical XNOR
xor       Logical XOR


XL-8032 INSTRUCTIONS 

The XL-8032 can execute all of the XL-Series control
and integer instructions, and has the following
instructions as well:


fadd          Floating Point Addition
fabs          Floating Point Absolute Value
fbr           Floating Point Branch
fclr          Clear Floating Point Register
fclsr         Clear Floating Point Status Register
fix           Float-to-Fix Conversion
fload         Load Floating Point Data
float         Fixed-to-Float Conversion
flut          Read Floating Point Look-up Table
fmac          Multiply-Accumulate
fmode         Set Floating Point Mode
fmov          Copy Floating Point Register
fmul          Floating Point Multiplication
fstore        Floating Point Store
fstsr         Store Floating Point Status Register
fsub, fsubr   Floating Point Subtraction

XL-8064 INSTRUCTION SET 

The XL-8064 can execute all XL-Series control and
integer instructions. The following is a partial list of the
XL-8064 instructions:

dfabs         Double Absolute Value
dfadd         Double Floating Add
dfcmp         Double Floating Compare
dfcnvt        Convert Double to Single
dcnvtf        Convert Single to Double
dfdiv         Double Floating Divide
dfix          Double-Precision to Integer (trunc.)
dfixr         Double-Precision to Integer (round)
dfloat        Fixed to Double-Precision
dfmov         Copy Floating Point Register
dfmul         Double Floating Multiply
dfneg         Double Floating Negate
dfsqrt        Double Floating Square Root
dfsub         Double Floating Subtract
dfsubr        Reverse Double Floating Subtract
dload         Double-Precision Load
dloadl        Double-Precision Load L.S. Data
dloadm        Double-Precision Load M.S. Data
dstore        Double-Precision Store
dstorel       Double-Precision Store, Least-Significant Word
dstorem       Double-Precision Store, Most-Significant Word
fadd          Floating Point Addition
fabs          Floating Point Absolute Value
fbr           Floating Point Branch
fclr          Clear Floating Point Register
fclsr         Clear Floating Point Status Register
fcmp          Floating Point Compare
fdiv          Floating Point Divide
fix           Float-to-Fix Conversion
fixr          Float-to-Fix Conversion (round)
fload         Load Floating Point Data
float         Fixed-to-Float Conversion
fmov          Copy Floating Point Register
fmul          Floating Point Multiplication
fsqrt         Floating Point Square Root
fstore        Floating Point Store
fstsr         Store Floating Point Status Register
fsub, fsubr   Floating Point Subtraction
fdcnvt        Convert Single to Double
max           Maximum of Two Values
min           Minimum of Two Values
Physical Dimensions


Figure 21. Physical dimensions for all XL-Series devices except the XL-8364 floating point unit 
[Package drawing: side view and top view]

LIMITS

Symbol   INCHES               MM
A1       .080 ± .008          2.032 ± 0.203
A2       .180 typ.            4.572 typ.
A3       .050                 1.270
D        1.575 sq. ± .016     40.005 sq. ± 0.406
E1       1.400 sq. ± .012     35.560 sq. ± 0.305
E2       .050 dia. typ.       1.270 dia. typ.
E3       .018 ± .002
d        .070 dia. typ.       1.778 dia. typ.
e        .100 typ.            2.540 typ.
[Package drawing: bottom view (Kovar stand-off), side view, and top view]

DIMENSIONS

Symbol   INCHES               MM
A1       0.095 ± 0.013        2.41 ± 0.33
A2       0.180 typ.           4.57 typ.
A3       0.050 typ.           1.27 typ.
D        1.750 sq. ± 0.022    44.5 sq. ± 0.56
E1       1.600 sq. ± 0.016    40.6 sq. ± 0.41
E2       0.050 dia. typ.      1.27 dia. typ.
E3       0.018 ± 0.002        0.46 ± 0.05
d        0.065 dia. typ.      1.65 dia. typ.
e        0.100 typ.           2.54 typ.


Figure 22. Physical dimensions for the XL-8364 floating point unit 
Ordering Information


DEVICES  PACKAGE TYPE        SPEED   TEMPERATURE RANGE  ORDER NUMBER
2        144-Pin Grid Array  120 ns  Tc = 0-85 °C       XL-8000-120-GCD
2        144-Pin Grid Array  100 ns  Tc = 0-85 °C       XL-8000-100-GCD


Figure 23. Ordering information for the XL-8000 


DEVICES  PACKAGE TYPE        SPEED   TEMPERATURE RANGE  ORDER NUMBER
3        144-Pin Grid Array  120 ns  Tc = 0-85 °C       XL-8032-120-GCD
3        144-Pin Grid Array  100 ns  Tc = 0-85 °C       XL-8032-100-GCD


Figure 24. Ordering information for the XL-8032 


DEVICES  PACKAGE TYPE        SPEED   TEMPERATURE RANGE  ORDER NUMBER
3        144-Pin Grid Array  120 ns  Tc = 0-85 °C       XL-8164-120-GCD
3        144-Pin Grid Array  100 ns  Tc = 0-85 °C       XL-8164-100-GCD


Figure 25. Ordering information for the XL-8164 


DEVICES  PACKAGE TYPE        SPEED   TEMPERATURE RANGE  ORDER NUMBER
2        144-Pin Grid Array  120 ns  Tc = 0-85 °C       XL-8364-120-GCD
1        168-Pin Grid Array
2        144-Pin Grid Array  100 ns  Tc = 0-85 °C       XL-8364-100-GCD
1        168-Pin Grid Array


Figure 26. Ordering information for the XL-8364 
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XL-SERIES 

OVERVIEW 


PRELIMINARY DATA 

April 1988 





A 

For additional information on WEITEK products, please fill out the form below and mail. 


Name Title 



Company 

Phone 


Address 



Comments 

I am currently involved in a design with the following Weitek products
____________________ and wish to be added to your
design data base to ensure that I receive status updates.


APPLICATION: 

□ ENGINEERING WORKSTATIONS □ SCIENTIFIC COMPUTERS 

□ GRAPHICS □ OTHER _ 

□ PERSONAL COMPUTERS 

□ Have a sales person call

Check the products on which you wish to receive data sheets:


ATTACHED PROCESSORS 

COPROCESSORS 

BUILDING BLOCKS 


□ XL-SERIES OVERVIEW 

□ 1167 

□ 2264/2265 

□ 2010 


□ 1164/1165 

□ 3132/3332 

□ 2245 


□ 3164/3364 

□ 1232/1233 

□ 2516 



□ 1066 

□ 2517 

WEITEK use: Rec’d 

Out 

TPT 

Source: DS 


Status 


WEITEK XL-SERIES OVERVIEW 

Please Comment On The Quality Of This Data Sheet. 

Even though we have tried to make this data sheet as complete as possible, it is conceivable that we have 
missed something that may be important to you. If you believe this is the case, please describe what the 
missing information is, and we will consider including it in the next printing of the data sheet. 


Fold, Staple and Mail to Weitek Corp. 


NO POSTAGE 
NECESSARY 
IF MAILED 
IN THE 

UNITED STATES 


BUSINESS REPLY MAIL

FIRST CLASS PERMIT NO. 1374 SUNNYVALE, CA


POSTAGE WILL BE PAID BY ADDRESSEE 

WEITEK Corporation 
1060 E. Arques Ave. 

Sunnyvale, CA 94086-BRM-9759 


ATTN: Ed Masuda









WEITEK



WEITEK’S CUSTOMER COMMITMENT: 

Weitek’s mission is simple: to provide you with VLSI solutions 
to solve your compute-intensive problems. We translate that 
mission into the following corporate objectives: 

1. To be first to market with performance breakthroughs, allowing
you to develop and market systems at the edge of your art.

2. To understand your product, technology, and market needs, so
that we can develop Weitek products and corporate plans that
will help you succeed.

3. To price our products based on the fair value they represent to
you, our customers.

4. To invest far in excess of the industry average in Research and
Development, giving you the latest products through technological
innovation.

5. To invest far in excess of the industry average in Selling,
Marketing, and Technical Applications Support, in order to provide
you with service and support unmatched in the industry.

6. To serve as a reliable, resourceful, and quality business partner
to our customers.

These are our objectives. We’re committed to making them 
happen. If you have comments or suggestions on how we can 
do more for you, please don’t hesitate to contact us. 



Headquarters 

Weitek Corporation 
1060 E. Arques Avenue 
Sunnyvale, CA 94086 
TWX 910-339-9545 
WEITEK SVL 
FAX (408) 738-1185 
TEL (408) 738-8400 


Domestic Sales Offices 
Weitek Corporation 
1060 E. Arques Avenue 
Sunnyvale, CA 94086 
TWX 910-339-9545 
WEITEK SVL 
FAX (408) 738-1185 
TEL (408) 738-8400 


Knox Trail Office Building 
2352 Main Street 
Concord, MA 01742 
TWX 910-380-7101 
FAX (617) 897-6729 
TEL (617) 897-3252 


European Sales Headquarters 

Greyhound House, 23/24 George St. 
Richmond, Surrey, TW9 1JY 
England 

TELEX 928940 RICHBIG 
FAX 011-441 940 6208 
TEL 011-441 5490164 


Japanese Representative 

C. Itoh Techno/Sciences 
Company Ltd. 

C. Itoh Building 
2-5-1 Kita-Aoyama 
Minato-Ku, Tokyo 107 
TELEX 781242 3240 
FAX (81) 3-497-4879 
TEL (81) 3-497-4975 




